ETSI TS 146 061 V7.0.0 (2007-06)



Technical Specification 



Digital cellular telecommunications system (Phase 2+); 

Substitution and muting of lost frames for 

Enhanced Full Rate (EFR) speech traffic channels 

(3GPP TS 46.061 version 7.0.0 Release 7) 






GLOBAL SYSTEM FOR 
MOBILE COMMUNICATIONS 





3GPP TS 46.061 version 7.0.0 Release 7 1 ETSI TS 146 061 V7.0.0 (2007-06) 



Reference 



RTS/TSGS-0446061 v700 
Keywords 



GSM 



ETSI 

650 Route des Lucioles 
F-06921 Sophia Antipolis Cedex - FRANCE 

Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16 

Siret N° 348 623 562 00017 - NAF 742 C 
Non-profit association registered at the 
Sous-Préfecture de Grasse (06) N° 7803/88 



Important notice 



Individual copies of the present document can be downloaded from: 
http://www.etsi.org 

The present document may be made available in more than one electronic version or in print. In any case of existing or 
perceived difference in contents between such versions, the reference version is the Portable Document Format (PDF). 
In case of dispute, the reference shall be the printing on ETSI printers of the PDF version kept on a specific network drive 
within ETSI Secretariat. 

Users of the present document should be aware that the document may be subject to revision or change of status. 

Information on the current status of this and other ETSI documents is available at 

http://portal.etsi.org/tb/status/status.asp 

If you find errors in the present document, please send your comment to one of the following services: 

http://portal.etsi.org/chaircor/ETSI_support.asp 

Copyright Notification 

No part may be reproduced except as authorized by written permission. 
The copyright and the foregoing restriction extend to reproduction in all media. 

© European Telecommunications Standards Institute 2007. 
All rights reserved. 

DECT™, PLUGTESTS™ and UMTS™ are Trade Marks of ETSI registered for the benefit of its Members. 
TIPHON™ and the TIPHON logo are Trade Marks currently being registered by ETSI for the benefit of its Members. 
3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners. 
Intellectual Property Rights 



IPRs essential or potentially essential to the present document may have been declared to ETSI. The information 
pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found 
in ETSI SR 000 314: "Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in 
respect of ETSI standards", which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web 
server ( http://webapp.etsi.org/IPR/home.asp ). 

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee 
can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web 
server) which are, or may be, or may become, essential to the present document. 



Foreword 

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP). 

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or 
GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables. 

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under 
http://webapp.etsi.org/key/queryform.asp. 
Foreword 

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP). 

The present document defines a frame substitution and muting procedure which is used by the Receive (RX) 
Discontinuous Transmission (DTX) handler when one or more lost speech or Silence Descriptor (SID) frames are 
received from the Radio Sub System (RSS) within the digital cellular telecommunications system. 

The contents of the present document are subject to continuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 

Version x.y.z 

where: 

x the first digit: 

1 presented to TSG for information; 

2 presented to TSG for approval; 

3 or greater indicates TSG approved document under change control. 

y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 

z the third digit is incremented when editorial only changes have been incorporated in the document. 
1 Scope 



The present document defines a frame substitution and muting procedure which shall be used by the Receive (RX) 
Discontinuous Transmission (DTX) handler when one or more lost speech or Silence Descriptor (SID) frames are 
received from the Radio Sub System (RSS). 

The requirements of the present document are mandatory for implementation in all GSM Base Station Systems (BSSs) 
and Mobile Stations (MSs) capable of supporting the Enhanced Full Rate speech traffic channel. It is not mandatory to 
follow the bit exact implementation outlined in the present document and the corresponding C-source code. 



2 References 



The following documents contain provisions which, through reference in this text, constitute provisions of the present 
document. 

• References are either specific (identified by date of publication, edition number, version number, etc.) or 
non-specific. 

• For a specific reference, subsequent revisions do not apply. 

• For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a 
GSM document), a non-specific reference implicitly refers to the latest version of that document in the same 
Release as the present document. 

[1] GSM 05.03: "Digital cellular telecommunications system (Phase 2+); Channel coding". 

[2] GSM 06.60: "Digital cellular telecommunications system (Phase 2+); Enhanced Full Rate (EFR) 
speech transcoding". 

[3] GSM 06.81: "Digital cellular telecommunications system (Phase 2+); Discontinuous transmission 
(DTX) for Enhanced Full Rate (EFR) speech traffic channels". 

[4] GSM 08.60: "Digital cellular telecommunications system (Phase 2+); Inband control of remote 
transcoders and rate adaptors for Enhanced Full Rate (EFR) and full rate traffic channels". 



3 Definitions and abbreviations 

3.1 Definitions 

For the purposes of the present document, the following term and definition applies: 

5-point median operation: consists of sorting the 5 elements belonging to the set for which the median operation is to 
be performed in an ascending order according to their values, and selecting the third largest value of the sorted set as the 
median value. 
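
The definition above can be illustrated with a minimal C sketch. The function name median5 follows the abbreviation used in the present document; the use of double values and of an insertion sort are assumptions made for illustration only and are not taken from the bit exact C-source code.

```c
#include <stddef.h>

/* 5-point median: sort a copy of the five values in ascending order
 * and return the third element of the sorted set.
 * Illustrative sketch only; types and sorting method are assumptions. */
double median5(const double x[5])
{
    double s[5];
    size_t i, j;

    for (i = 0; i < 5; i++)
        s[i] = x[i];

    /* Insertion sort, ascending */
    for (i = 1; i < 5; i++) {
        double key = s[i];
        j = i;
        while (j > 0 && s[j - 1] > key) {
            s[j] = s[j - 1];
            j--;
        }
        s[j] = key;
    }

    return s[2];
}
```

The input array is left unchanged, so the same gain history can be reused by the caller after the median has been taken.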

Further definitions of terms used in the present document can be found in GSM 06.60 [2], GSM 06.81 [3], 
GSM 05.03 [1] and GSM 08.60 [4]. 

3.2 Abbreviations 

For the purposes of the present document, the following abbreviations apply: 

BFI Bad Frame Indication from Radio Sub System 

BSI_Abis Bad Sub-block Indication obtained from A-bis CRC checks 

CCU Channel Coding Unit 

CRC Cyclic Redundancy Check 
DTX Discontinuous Transmission 

median5 5-point median operation 

PrevBFI Bad Frame Indication of Previous frame 

RSS Radio Sub System 

RX Receive 

SID Silence Descriptor frame 

TRAU Transcoding Rate Adaptation Unit 



4 General 



The purpose of frame substitution is to conceal the effect of lost frames. The purpose of muting the output in the case of 
several lost frames is to indicate the breakdown of the channel to the user and to avoid generating possibly annoying 
sounds as a result of the frame substitution procedure. 

The RSS indicates lost speech or SID frames by setting its Bad Frame Indication flag (BFI) based on its 3-bit and 8-bit 
CRCs and possibly other error detection mechanisms. The TRAU calculates from the CRCs inserted by the CCU in the 
TRAU frames one BSI_Abis flag for every sub-block of speech parameters. If one or more of these flags is set, 
the speech decoder shall perform either frame substitution or subframe substitution. 

The example solution provided in clause 6 applies only for bad frame handling on a complete speech frame basis. 
However some parts could be modified for substitution of bad sub-blocks. 



5 Requirements 

5.1 Error detection 

An error is detected and the BFI flag is set by the RSS according to the principle described in clause 4. 



5.2 Lost speech frames 



Normal decoding of lost speech frames would result in very unpleasant noise effects. In order to improve the subjective 
quality, lost speech frames shall be substituted with either a repetition or an extrapolation of the previous good speech 
frame(s). This substitution is done so that it will gradually decrease the output level, resulting in silencing of the output. 
Clause 6.1 gives an example solution. 



5.3 First lost SID frame 

A single lost SID frame shall be substituted by the last valid SID frame and the procedure for valid SID frames shall be 
applied as described in GSM 06.81 [3]. 



5.4 Subsequent lost SID frames 



For the second lost SID frame, a muting technique shall be used on the comfort noise that will gradually decrease the 
output level (-3 dB/frame), resulting in silencing of the output of the decoder. 

For subsequent lost SID frames, the muting of the output shall be maintained. Clause 6.2 gives an example solution. 



6 Example solution 

The C-code of the following example is embedded in the bit exact software of the Enhanced Full Rate codec. 

6.1 Example solution for substitution and muting of lost speech frames 

This example solution for substitution and muting is based on a state machine with seven states (figure 1). 

The system starts in state 0. Each time a bad frame is detected, the state counter is incremented by one and is saturated 
when it reaches 6. Each time a good speech frame is detected, the state counter is reset to zero, except when we are in 
state 6, where we set the state counter to 5. The state indicates the quality of the channel: the bigger the state counter, 
the worse the channel quality is. The control flow of the state machine can be described with the following C-code (BFI 
= bad frame indicator, State = state variable): 

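if (BFI != 0)
    State = State + 1;   /* bad frame: channel quality worsens */
else if (State == 6)
    State = 5;           /* good frame after saturation: step back to 5 */
else
    State = 0;           /* good frame: reset */
if (State > 6)
    State = 6;           /* saturate at 6 */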
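
As an illustration, the control flow above can be wrapped in a small function so that the evolution of the state counter over a sequence of frames can be traced. The function name update_state and its signature are hypothetical and are not part of the bit exact C-source code.

```c
/* Return the next value of the state counter, given the current state
 * (0..6) and the bad frame indicator of the current frame.
 * Hypothetical wrapper around the control flow given in the text. */
int update_state(int state, int bfi)
{
    if (bfi != 0)
        state = state + 1;   /* bad frame */
    else if (state == 6)
        state = 5;           /* good frame while saturated */
    else
        state = 0;           /* good frame */

    if (state > 6)
        state = 6;           /* saturate */

    return state;
}
```

Starting from state 0, seven consecutive bad frames leave the counter saturated at 6; the first good frame then gives state 5 and a second good frame returns the counter to 0.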

In addition to this state machine, the Bad Frame Flag from the previous frame is checked (PrevBFI). The processing 
depends on the value of the State variable. In states 0 and 5, the processing depends also on the two flags BFI and 
PrevBFI. 

The procedure can be described as follows: 



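[Figure 1: state diagram. The states and the flag values with which each state is entered are:] 

STATE = 0: BFI = 0, PrevBFI = 0 or 1 
STATE = 1: BFI = 1, PrevBFI = 0 
STATE = 2: BFI = 1, PrevBFI = 1 
STATE = 3: BFI = 1, PrevBFI = 1 
STATE = 4: BFI = 1, PrevBFI = 1 
STATE = 5: BFI = 0 or 1, PrevBFI = 1 
STATE = 6: BFI = 1, PrevBFI = 0 or 1 

Transitions: a bad frame (BFI = 1) leads from state n to state n + 1 (state 6 is kept); a good frame (BFI = 0) leads to 
state 0, except from state 6, where it leads to state 5. 

Figure 1: State machine for controlling the bad frame substitution 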

BFI = 0, PrevBFI = 0, State = 0 

No error is detected in the received or in the previous received speech frame. The received speech parameters are used 
normally in the speech synthesis. The current frame of speech parameters is saved. 

BFI = 0, PrevBFI = 1, State = 0 or 5 

No error is detected in the received speech frame but the previous received speech frame was bad. The LTP-gain and 
fixed codebook gain are limited below the values used for the last received good subframe: 



    g^p = \begin{cases} g^p, & g^p \le g^p(-1) \\ g^p(-1), & g^p > g^p(-1) \end{cases}    (1) 

where g^p = current decoded LTP-gain, g^p(-1) = LTP-gain used for the last good subframe (BFI = 0), and 

    g^c = \begin{cases} g^c, & g^c \le g^c(-1) \\ g^c(-1), & g^c > g^c(-1) \end{cases}    (2) 

where g^c = current decoded fixed codebook gain and g^c(-1) = fixed codebook gain used for the last good subframe 
(BFI = 0). 

The rest of the received speech parameters are used normally in the speech synthesis. The current frame of speech 
parameters is saved. 
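
The gain limitation for a good frame that follows a bad frame can be sketched in C as follows. The function name limit_gain and the use of double values are assumptions made for illustration; the same function applies to both the LTP-gain and the fixed codebook gain.

```c
/* Limit a decoded gain for a good frame that follows a bad frame.
 * decoded   : gain decoded from the current subframe
 * last_good : gain used for the last good subframe (BFI = 0)
 * The limit applies only when BFI = 0 and PrevBFI = 1; otherwise the
 * decoded gain is returned unchanged.
 * Hypothetical helper, not taken from the bit exact C-source code. */
double limit_gain(double decoded, double last_good, int bfi, int prev_bfi)
{
    if (bfi == 0 && prev_bfi != 0 && decoded > last_good)
        return last_good;
    return decoded;
}
```

For example, with a last good gain of 0.5, a decoded gain of 0.9 is limited to 0.5, whereas a decoded gain of 0.3 is used as received.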

BFI = 1, PrevBFI = 0 or 1, State = 1...6 

An error is detected in the received speech frame and the substitution and muting procedure is started. The LTP-gain 
and fixed codebook gain are replaced by attenuated values from the previous subframes: 

    g^p = \begin{cases} P(state)\, g^p(-1), & g^p(-1) \le \mathrm{median5}(g^p(-1), \ldots, g^p(-5)) \\ P(state)\, \mathrm{median5}(g^p(-1), \ldots, g^p(-5)), & g^p(-1) > \mathrm{median5}(g^p(-1), \ldots, g^p(-5)) \end{cases}    (3) 

where g^p = current decoded LTP-gain, g^p(-1), ..., g^p(-n) = LTP-gains used for the last n subframes, 
median5() = 5-point median operation, P(state) = attenuation factor (P(1) = 0.98, P(2) = 0.98, P(3) = 0.8, P(4) = 0.3, 
P(5) = 0.2, P(6) = 0.2), state = state number, and 

    g^c = \begin{cases} C(state)\, g^c(-1), & g^c(-1) \le \mathrm{median5}(g^c(-1), \ldots, g^c(-5)) \\ C(state)\, \mathrm{median5}(g^c(-1), \ldots, g^c(-5)), & g^c(-1) > \mathrm{median5}(g^c(-1), \ldots, g^c(-5)) \end{cases}    (4) 

where g^c = current decoded fixed codebook gain, g^c(-1), ..., g^c(-n) = fixed codebook gains used for the last n 
subframes, median5() = 5-point median operation, C(state) = attenuation factor (C(1) = 0.98, C(2) = 0.98, C(3) = 0.98, 
C(4) = 0.98, C(5) = 0.98, C(6) = 0.7), and state = state number. 
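
The two gain substitution rules above can be sketched in C as follows. The attenuation factors are those listed in the text; the names P_ATT, C_ATT, substitute_gain, substitute_ltp_gain and substitute_code_gain, the array layout (past[0] holds the gain of subframe -1) and the use of double values are assumptions made for illustration and do not reproduce the bit exact C-source code.

```c
#include <stddef.h>

/* Attenuation factors indexed by state 1..6 (index 0 is unused). */
static const double P_ATT[7] = {0.0, 0.98, 0.98, 0.8, 0.3, 0.2, 0.2};
static const double C_ATT[7] = {0.0, 0.98, 0.98, 0.98, 0.98, 0.98, 0.7};

/* 5-point median: third element of the ascending sorted copy. */
double median5(const double x[5])
{
    double s[5];
    size_t i, j;

    for (i = 0; i < 5; i++)
        s[i] = x[i];
    for (i = 1; i < 5; i++) {
        double key = s[i];
        j = i;
        while (j > 0 && s[j - 1] > key) {
            s[j] = s[j - 1];
            j--;
        }
        s[j] = key;
    }
    return s[2];
}

/* Attenuate the smaller of the most recent gain and the median of the
 * last five gains. past[0] = g(-1), ..., past[4] = g(-5). */
double substitute_gain(const double past[5], double attenuation)
{
    double med = median5(past);
    double base = (past[0] <= med) ? past[0] : med;
    return attenuation * base;
}

/* Substituted LTP-gain; state is assumed to be in the range 1..6. */
double substitute_ltp_gain(const double past[5], int state)
{
    return substitute_gain(past, P_ATT[state]);
}

/* Substituted fixed codebook gain; state is assumed to be in 1..6. */
double substitute_code_gain(const double past[5], int state)
{
    return substitute_gain(past, C_ATT[state]);
}
```

Taking the smaller of the most recent gain and the median keeps a single unusually large past gain from being carried into the substituted frames.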

The higher the state value is, the more the gains are attenuated. Also the memory of the predictive fixed codebook gain 
is updated by using the average value of the past four values in the memory: 

    ener(0) = \frac{1}{4} \sum_{i=1}^{4} ener(-i)    (5) 
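
A minimal C sketch of this memory update is given below. The array layout (ener[0] holds ener(-1), the most recent value), the shifting of the memory after the new value is computed, the function name update_ener_memory and the use of double values are all assumptions made for illustration.

```c
/* Update the memory of the predictive fixed codebook gain for a bad
 * frame: the new value is the average of the past four values.
 * ener[0] = ener(-1), ..., ener[3] = ener(-4) on entry (assumed layout).
 * On exit, the new value is stored as the most recent entry and the
 * oldest entry is dropped. */
void update_ener_memory(double ener[4])
{
    double avg = (ener[0] + ener[1] + ener[2] + ener[3]) / 4.0;

    ener[3] = ener[2];
    ener[2] = ener[1];
    ener[1] = ener[0];
    ener[0] = avg;
}
```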

The past LSFs are used by shifting their values towards their mean: 

    lsf\_q1(i) = lsf\_q2(i) = \alpha \, past\_lsf\_q(i) + (1 - \alpha) \, mean\_lsf(i), \quad i = 0, \ldots, 9    (6) 

where \alpha = 0.95, lsf_q1 and lsf_q2 are two sets of LSF-vectors for current frame, past_lsf_q is lsf_q2 from the previous 
frame, and mean_lsf is the average LSF-vector. 
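
The LSF substitution can be sketched in C as follows. The vector names follow the text; the constants LSF_ORDER and ALPHA, the function name substitute_lsf and the use of double values are assumptions made for illustration.

```c
#define LSF_ORDER 10     /* i = 0...9 */
#define ALPHA     0.95   /* weighting of the past LSFs, from the text */

/* Shift the past LSFs towards their mean and use the result for both
 * LSF sets of the current frame. Hypothetical helper. */
void substitute_lsf(const double past_lsf_q[LSF_ORDER],
                    const double mean_lsf[LSF_ORDER],
                    double lsf_q1[LSF_ORDER],
                    double lsf_q2[LSF_ORDER])
{
    int i;

    for (i = 0; i < LSF_ORDER; i++) {
        double v = ALPHA * past_lsf_q[i] + (1.0 - ALPHA) * mean_lsf[i];
        lsf_q1[i] = v;
        lsf_q2[i] = v;
    }
}
```

With a past value of 1000 and a mean of 2000, each substituted LSF is approximately 1050, so repeated bad frames move the spectrum gradually towards the average LSF-vector.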

The LTP-lag values are replaced by the past value from the 4th subframe of the previous frame. 

The received fixed codebook excitation pulses from the erroneous frame are always used as such. 

6.2 Example solution for substitution and muting of lost SID frames 

The first lost SID frame is replaced by the last valid SID frame. 

For subsequent lost SID frames, the last valid SID frame is repeated, but the fixed codebook gain is decreased by a 
constant 3 dB in each frame, down to the minimum value of 0. This value is maintained if additional lost SID 
frames occur. 
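
The muting of the comfort noise gain can be sketched in C as follows. The sketch assumes that the gain is held as a linear amplitude value, so that a decrease of 3 dB corresponds to multiplication by approximately 0.7079 (10 to the power of -3/20). The threshold GAIN_FLOOR below which the gain is set to 0, the constant MUTE_STEP and the function name mute_sid_gain are hypothetical; the bit exact C-source code may represent and decrease the gain differently.

```c
#define MUTE_STEP  0.7079   /* approximately 10^(-3/20), i.e. -3 dB */
#define GAIN_FLOOR 1.0e-3   /* hypothetical threshold treated as 0 */

/* Return the fixed codebook gain to use for the next lost SID frame.
 * The gain is decreased by 3 dB per frame and held at 0 once it has
 * fallen below GAIN_FLOOR. Illustrative sketch only. */
double mute_sid_gain(double gain)
{
    gain *= MUTE_STEP;
    if (gain < GAIN_FLOOR)
        gain = 0.0;
    return gain;
}
```

Once the gain has reached 0, further calls return 0, which corresponds to maintaining the muted output while additional lost SID frames occur.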